Poster: stbalbach Date: Sep 4, 2007 7:36am
Forum: texts Subject: Internet Archive vs. Google Books

I came across this quote recently on a blog, in reference to The Open Library:

"At some point Brewster & Co at the Internet Archive realized that they weren't going to be able to "win" the scanning project, so they decided they'd rather be the go to place for people to FIND the results of others' scanning projects."

Source: http://www.metafilter.com/62983/The-Open-Library#1767820

I hope this is not true. Internet Archive has already "won" the scanning game; it is the leader. It may not have as many books as Google in raw numbers, but Google just sucks.

1) The whole point of scanning old books is to be able to read them in their native format. Otherwise, we have Project Gutenberg and/or OCR text files. Most of these old books were printed by letterpress, which has an organic beauty and quality that surpasses modern books. If you were to buy a newly printed letterpress book today, it would cost hundreds of dollars a copy (there are still some publishers).

2) In order to capture letterpress printing and the illustrations, a high-quality color scan is needed. Google has completely missed the boat, focusing on content only.

3) Even on content, Google has screwed up: the scans are often illegible and unusable. Not just a few pages, but entire chapters are smeared, blurry, chopped and missing.

Google is such a shame. They had the opportunity to do something really important and they missed out on the details.

Poster: EmilPer Date: Oct 19, 2007 11:41pm
Forum: texts Subject: Re: Internet Archive vs. Google Books

Google also makes it very hard to use the book: getting the bibliographic data, for example, or making a list of books that can be imported into a word processor, or making a link (they used to have the OCLC number in the URL; now it's hidden in a link on the book details page).

Archive.org is not perfect either: I have not yet figured out how to search the full text. Are they trying to persuade us that it is better to harvest interesting titles and index them at home?

The French did a good job with Gallica (http://gallica.bnf.fr) at the beginning (about 1999), but like most things European, after a good start it seems to have slowed down, and it will probably end in a shouting match with the Germans, the way Galileo and the European search engine ended.

Poster: Cedric Aubel Date: Nov 26, 2007 12:07pm
Forum: texts Subject: Re: Internet Archive vs. Google Books

About Gallica: it is speeding up with Gallica2 (just opened)!
They are going to scan a LOT more.

Au revoir...

Poster: EmilPer Date: Nov 26, 2007 12:41pm
Forum: texts Subject: Re: Internet Archive vs. Google Books

Gallica2 does look good indeed. It is much faster than the old one (or at least appears so), and has a better interface.

Poster: Cedric Aubel Date: Nov 26, 2007 2:52pm
Forum: texts Subject: Re: Internet Archive vs. Google Books

But the most important thing is the speeding up:

Until 2005: ~6,000 docs scanned / year
Since 2007: ~100,000 docs scanned / year (they will begin to show up on Gallica2 in February/March 2008)

More info (in French) at: http://www.bnf.fr/pages/catalog/bibliotheque_numerique.htm

Poster: EmilPer Date: Nov 26, 2007 3:04pm
Forum: texts Subject: Re: Internet Archive vs. Google Books

100,000 is a lot, but the most encouraging fact was seeing an internationalized interface at gallica2.bnf.fr ... which means it's more about books than about "francophonie ueber alles". I am not American, French or German, and the recent shrillness of the "dialogue" between the bureaucrats/propaganda machines of those three states makes me a little queasy.

Thank you for pointing me to Gallica2.